The general purpose of this study was to compare the
differences in middle school students' mathematics achievement, their 
changes in attitude towards mathematics, and their attitude towards 
evaluation when evaluated with two different measurement strategies. 
The primary purpose of the study was to compare aspects of 
criterion-referenced and norm-referenced evaluation within selected 
sixth and seventh grade mathematics classes at the University of 
Northern Colorado Laboratory School. The design for this
investigation was a quasi-experimental nonequivalent control group
design. Ninety-five students were assessed in regard to mathematics
achievement, attitude towards mathematics, and attitude towards
evaluation at the beginning and again at the termination of the
12-week trimester. Overall, students obtained higher achievement 
scores when evaluated using a criterion-referenced method keyed to 
the specific performance objectives of an individualized 
instructional program. Students evaluated by criterion-referenced 
methods demonstrated significantly more positive attitudes towards 
the subject than did those middle school students evaluated by 
norm-referenced methods. Middle school students indicated no 
preference for either the criterion-referenced or the norm-referenced 
methods of evaluation. (Author/PN) 
Abstract

The general purpose of this study was to compare the differences in 
middle school students' mathematics achievement, their changes in attitude 
towards mathematics and their attitude towards evaluation when evaluated 
with two different measurement strategies. The primary purpose of the 
study was to compare aspects of criterion-referenced and norm-referenced 
evaluation within selected sixth and seventh grade mathematics classes at 
the University of Northern Colorado Laboratory School. This study was
concerned with three specific questions: (1) Is there a difference in
mathematics achievement between students evaluated by criterion-referenced
methods and those evaluated by norm-referenced methods? (2) Is there a
difference in attitude towards mathematics between students evaluated by
criterion-referenced methods and those evaluated by norm-referenced methods?
(3) Is there a difference in attitude towards evaluation between students
evaluated by criterion-referenced methods and those evaluated by
norm-referenced methods? The design for this investigation was a
quasi-experimental nonequivalent control group design. The population for
this study was provided by the Laboratory School of the University of
Northern Colorado. The students in the Middle School's sixth and seventh
grade classes were normally divided into four general mathematics classes by
their daily class schedule.
Two classes, one sixth grade and one seventh grade, were randomly selected 
as experimental groups, leaving the two remaining classes as control groups. 
The purpose of these classes was to explore general mathematics topics. 
Prior to the experiment the researcher and the participating teacher developed 
specific performance objectives so designed as to outline three four-week 
instructional units for all groups. The content taught to the control and
experimental groups was the same for all similar groups. All four classes
were taught by the same instructor. Three instruments were used to
generate pretest and posttest scores for comparison. The researcher
and the participating teacher developed criterion-referenced tests of
mathematics achievement keyed to the instructional performance
objectives to measure the participants' mathematical progress. The researcher
developed a Likert-type (equal-appearing interval scale) attitude scale
to register the extent of the participants' agreement or disagreement
with a set of predetermined basic concepts concerning criterion-referenced
and norm-referenced evaluation. The researchers also developed
a Likert-type attitude instrument in order to measure the participants'
attitude towards mathematics. Participants included ninety-five sixth 
and seventh grade middle school students at the University of Northern 
Colorado Laboratory School during the spring trimester of 1975. All
participants were assessed in regard to mathematics achievement, attitude
towards mathematics and attitude towards evaluation at the beginning and
again at the termination of the twelve-week trimester. The findings
in this experiment led to the following conclusions: (1) Overall, the
achievement of middle school students is significantly affected by the method
used for evaluation in the instructional process. (2) Higher achieving
middle school students are less affected by the method of evaluation used in
the instructional process than are lower achieving students. (3) Lower
achieving middle school students are affected more by the method of
evaluation used in the instructional process than are higher achieving students.
(4) Overall, middle school students obtained higher achievement scores when
evaluated using a criterion-referenced method keyed to the specific
performance objectives of an individualized instructional program.
(5) A middle school student's attitude towards a subject or course of
study is significantly affected by the type of evaluation system used.
(6) Overall, middle school students evaluated by criterion-referenced
methods demonstrated significantly more positive attitudes towards the
subject than did those middle school students evaluated by norm-referenced
methods. (7) Middle school students indicated no preference for either
the criterion-referenced or the norm-referenced methods of evaluation.
learners, even if the differences are trivial in terms
of subject matter.... 

In any group of students we expect to have some small percent
receive an "A" grade. We are surprised when the percentage
differs greatly from about ten percent. We are
also prepared to fail an equal proportion of students.
Quite frequently, this failure is determined by rank order 
of the students in the groups rather than by their failure 
to grasp the essential ideas of the course. 

Statement of the Problem 
Do teacher grading methods affect pupil achievement? For generations
we have accepted the following three-stage instructional model.



Insert Figure I about here 



The fact that we believe in a high correlation between teaching 
and learning is supported by the overwhelming volume of research on 
teaching media and methodology. A consistent body of research exists 
concerning the ways in which pupils learn. But what is the effect 
of the evaluation process on pupil achievement? Is pupil evaluation
an unrelated assessment process, or is it an integral part of the
teaching-learning model that may not only assess instruction but also
affect the success of that instruction?

This research activity was based on the concept that pupil evaluation
is an integral part of the teaching-learning process and therefore may
affect pupil achievement. In addition, all other teaching-learning
activities such as methods, materials and other student-related activities
must be planned with respect to the pupil evaluation methods used. 

Description of Procedures and Design 
This experiment was conducted at the University of Northern 
Colorado Laboratory School at Greeley, Colorado. The school maintains 
a K-12 program as a department of the College of Education. It is 
designed to provide a facility for research and experimentation with 
new teaching methods and to offer preteaching experiences for the 
college's professional teacher education programs. Approximately 600 
students are enrolled on a first-come, first-served basis. The
population selected for this study was the 97 students enrolled in the
middle school's sixth and seventh grade mathematics classes. These
classes were designed to explore topics of general mathematics and the 
subject matter content of the courses was structured so that all similar 
student groups were exposed to the same established curriculum. 

At the beginning of the 1974-75 school year the sixth and seventh
grade students were randomly assigned to an A.M. or P.M. class, and
all four classes were taught by the same instructor. The data for this
experiment was collected during the three twelve-week trimesters.



Insert Figure II about here



The mathematics classes of the middle school program were divided
into two sections for each grade by their normal school schedule, thus
providing four sections for the experiment. One section of each grade
was randomly chosen by the toss of a coin to be the experimental group
while the other section remained as the control group. The experimental
group was to be evaluated by using a criterion-referenced evaluation
method while the control group used a norm-referenced method.

Prior to the experiment the participants were administered three 
pretest instruments. First, a Mathematics Achievement Test, designed
by the researchers and containing randomly selected test items matched
with the performance objectives developed for the experiment, was given
to each of the four classes. Secondly, all participants were administered
a Likert-type Grading Attitude Questionnaire containing items designed
by the researchers to determine if the participants indicated any preference
for norm-referenced or criterion-referenced evaluative methods. Finally, all
participants were administered a Student Attitude Questionnaire
developed for use with middle school students by the researchers to
assess the participants' attitude toward mathematics. An analysis of
pretest data indicated no significant difference among the four mathematics
groups. At the conclusion of the pretest period all groups began a
series of four instructional units. The objectives of the units were
designed so that all similar groups covered the same content. The instructor
for all groups was the same, and every attempt was made to ensure that the
only controllable difference between the two groups was the method of pupil
evaluation.

Experimental Treatment
The experimental group (criterion-referenced group) was evaluated
at the conclusion of each instructional unit (approximately every four
weeks) using an instructional unit criterion-referenced mathematics
test developed by the researchers that was keyed to the specific performance
objectives of the instructional unit. Each student's grade was determined
by comparing his/her score to a predetermined performance standard. In
addition, recognition was given for exceeding the basic performance
standard or basic skill level. For purposes of continuity with the
control group, letter grades were awarded to the experimental group using
a letter code developed previously by Emmert and Wilburn (1974). Basically,
this code was as follows:



Insert Grading Code Here 



Control Group 

The control group (norm-referenced) was evaluated at the conclusion
of each instructional unit using an instructional unit achievement test
identical to that of the experimental group. However, each student's grade was
not determined by comparing his/her score to a predetermined performance
standard, but rather by comparing it to the scores of the other members of
the group. Grades were given so as to outline a normal distribution. The
traditional A, B, C letter code was used in accordance with the distribution
as outlined in Figure III.



Insert Figure III about here
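
As a rough illustration of the norm-referenced procedure described for the control group, the following Python sketch assigns letter grades purely by rank within the group. The exact proportions of Figure III are not stated in the text, so the quota percentages here are hypothetical (10/20/40/20/10), chosen only to echo the "about ten percent" of A grades and the equal proportion of failures mentioned in the opening quotation. The function name `norm_referenced_grades` and the D and F letters are likewise assumptions made for illustration, not the instrument used in the study.

```python
# Hypothetical quotas (percent of the group per letter); NOT taken from Figure III.
DEFAULT_QUOTAS = (("A", 10), ("B", 20), ("C", 40), ("D", 20), ("F", 10))


def norm_referenced_grades(scores, quotas=DEFAULT_QUOTAS):
    """Assign letters by rank order within the group.

    scores -- dict mapping student name to test score
    quotas -- sequence of (letter, percent) pairs, best letter first, summing to 100
    Ties are broken by the order in which students appear in `scores`.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    grades = {}
    start = 0
    cumulative = 0
    for letter, percent in quotas:
        cumulative += percent
        # Integer rounding of the cumulative share of the group.
        end = (cumulative * n + 50) // 100
        for name in ranked[start:end]:
            grades[name] = letter
        start = end
    return grades


# Ten students who all answered at least 91 percent of the items correctly.
example_scores = {f"s{i}": 100 - i for i in range(10)}
example_grades = norm_referenced_grades(example_scores)
```

In this example every student scores 91 or higher, yet the lowest-ranked student still receives the lowest letter, because the grade depends on position in the group rather than on a performance standard.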



The experiment continued in this manner through four instructional
units over a period of twelve weeks. At the conclusion of the
experiment all participants were administered the Mathematics Achievement
Posttest, Grading Attitude Questionnaire and Student Attitude Questionnaire.

Statistical Analysis 

The statistical model for this experiment utilized a 2 x 4 factorial
design. The students were categorized by pretest achievement test
scores and sex. Three analyses were made, one for each dependent
variable (achievement, mathematics attitude, and attitude toward grading),
by comparing means of the groups on pretest and posttest scores. This
statistical model is outlined in Figure IV.



Insert Figure IV about here



The statistical procedure utilized in this process was factorial
analysis of variance, as conducted in the Biomedical 05V analysis of
variance program, to look at differences in mean pretest and posttest scores
in regard to each independent variable and each possible interaction
factor. The final statistical test for significance was the two-tailed
F test at the .05 significance level.

Analysis and Summary of the Data

Mathematics Achievement 

Mathematics achievement scores were obtained from the Mathematics
Achievement Test administered before and after the experiment. The
result of the analysis of variance of the mathematics achievement gain
scores with grade as a covariate appears in Table 1.



Insert Table 1 about here 



The analysis of variance of the mathematics achievement gain scores
produced significant results concerning pretest level effect, achievement
by grade interaction, and level by sex interaction. The differences
between means of gain scores of the pretest level effect are presented
in Table 2.



Insert Table 2 about here 



The results of this analysis indicated that there is a significant
difference in achievement between students with low pretest scores depending
upon whether they were evaluated by norm-referenced or criterion-referenced
methods. This difference is illustrated in Table 3.

Insert Table 3 about here 

Mathematics Attitude

The results of the analysis performed on the data from the pupil 
Mathematics Attitude Questionnaire revealed that students evaluated 
by the criterion-referenced methods received significantly higher 
mathematics attitude scores than those evaluated by the norm-referenced 
methods. 



Insert Table 4 about here 



Table 4 indicates that mathematics attitude scores demonstrated a
significant relationship with the method of evaluation used. Further
examination of the data also indicated that, in addition to the overall
significant effect of criterion-referenced evaluation methods on the
experimental groups, the grade of the participants produced a
significant interaction. This interaction illustrates that the seventh grade
participants received significantly higher mathematics attitude scores
than their sixth grade counterparts.

Attitude Toward Evaluation

The data provided by the QPAO revealed no statistically significant
differences for either the criterion-referenced or the norm-referenced system.
There seemed to be no significant preference among participants for either
evaluation system as indicated by their scores on the pupil grading attitude
questionnaire.

Summary, Conclusions and Implications
The researchers, in this study, attempted to determine whether
students working under a "normal curve" approach to grading or students
under a criterion-referenced system that allows grades to be distributed
according to relative performance and improvement in regard to
predetermined performance standards would reveal greater achievement gain scores.
The individualized criterion-referenced approach developed in this study
was based on criterion-referenced measurement techniques. The students
being evaluated by this individualized criterion-referenced system were
not restricted by a predetermined curve or normal distribution. That
is to say, no predetermined quotas were set on grade categories
nor were proportions among categories attempted. If there was a
predetermined category, it was to have all students achieve the highest possible
achievement score. The findings in this experiment led to the following
conclusions: 1) Overall, middle school students obtained higher
achievement scores when evaluated using a criterion-referenced method.
2) Higher achieving middle school students were less affected by the method of
evaluation used in the instructional process than were average and lower
achieving students. 3) Lower achieving middle school students were
affected more by the method of evaluation used in the instructional process
than were higher achieving students. 4) Overall, middle school students
evaluated by criterion-referenced methods demonstrated significantly
more positive attitudes towards the subject under study than did those
middle school students evaluated by norm-referenced methods. 5) Middle
school students indicated no preference for either the criterion-referenced
or the norm-referenced methods of evaluation.

Implications

Because the middle school student is undergoing a transient period 
marked by the youngster's transition from dependence upon the family 
for security to a similar dependence on the peer group, he/she demands 
special considerations in selecting evaluative methods and techniques. 
The evaluation system, to be most effective, should not contribute to
negative peer group pressures and should attempt to maximize the potential
benefits of the changing attitudes of the transient learner. The concern of
the middle school for the individual and his/her opportunity for
individualized learning should be extended to encompass the evaluative methods
of the school program.

The implications of this study in regard to the effect of evaluation 
on achievement may be summarized as follows: (1) The traditional
norm-referenced grading system restricts the academic achievement of most middle
school students. (2) A performance-based, criterion-referenced,
instructional and evaluative strategy significantly increases the academic ability
of middle school students when compared to traditional normative methods.
(3) Schools must begin to individualize the evaluation segment of the
teaching-learning-evaluation process to best fit the academic grade level
and abilities of each group of students and/or individuals. (4) The use
of a criterion-referenced evaluation system is of particular benefit to
low-achieving students.

One of the strongest drives of the middle school student is the basic
need for an identity, a belief that he/she is someone different from
others, that this someone is important to others and that they
see themselves as worthwhile. Since the middle school attempts to
address itself to this need by maintaining an atmosphere of basic
respect for individual differences while providing an environment
where the opportunity to succeed is ensured for all students, we believe
that the needs of the middle school students are best served by an evaluation
system that is individualized and criterion-referenced. The implications for
the middle school clearly seem to be that taking students out of the
competitive atmosphere of the traditional norm-referenced pupil
evaluation system and placing them in a less competitive and less threatening
atmosphere of a performance-based, criterion-referenced pupil evaluation
system more appropriately aligns the evaluative process with the basic needs
of middle school students.

Even though we recognize that the probability of any pupil evaluation
system being 100 percent effective for every teacher and every
child is practically non-existent, the school cannot refuse to be
accountable to the student, parent, and the community in attempting to 
realistically evaluate the educational progress of each student in 
regard to his/her own abilities. 

Figure I

Traditional Instructional Model 

Figure II
Experimental Design

Experimental Group: 6th Grade A.M. and 7th Grade P.M. classes

Control Group: 6th Grade P.M. and 7th Grade A.M. classes

Grading Code

Unsatisfactory   Student did not meet basic level.

Satisfactory     Student did not meet basic level but showed improvement.

Basic Skill      Student met basic performance level (predetermined).

Proficiency      Student met and exceeded basic skill level and correctly
                 answered 90 percent of the items.

Mastery          Student, after meeting basic skill level, contracted with
                 the instructor and completed a special project demonstrating
                 application of the objectives.
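
The grading code can be read as a simple decision rule applied to each student independently of the rest of the class. The Python sketch below is one possible reading under stated assumptions: scores are expressed as percent of items correct, the basic level is a number below 90 supplied by the teacher, "showed improvement" is a yes/no judgment made outside the function, and Mastery takes precedence over Proficiency. None of these details, nor the name `criterion_code`, comes from the study itself.

```python
def criterion_code(percent_correct, basic_level, improved=False, completed_project=False):
    """Return the grading-code category for one student on one unit test.

    percent_correct   -- percent of unit-test items answered correctly (0-100)
    basic_level       -- predetermined basic performance level on the same scale
                         (hypothetical value, assumed to be below 90)
    improved          -- True if the student showed improvement (assumed teacher judgment)
    completed_project -- True if a contracted special project was completed
    """
    if percent_correct < basic_level:
        # Below the basic level: the category depends only on improvement.
        return "Satisfactory" if improved else "Unsatisfactory"
    if completed_project:
        # Assumption: Mastery outranks Proficiency once the basic level is met.
        return "Mastery"
    if percent_correct >= 90:
        return "Proficiency"
    return "Basic Skill"


# Example with a hypothetical basic level of 70 percent.
example_codes = [
    criterion_code(50, 70),
    criterion_code(50, 70, improved=True),
    criterion_code(75, 70),
    criterion_code(92, 70),
    criterion_code(75, 70, completed_project=True),
]
```

Because each student is compared only with the standard, any number of students can land in any category, which is the contrast with a quota-based distribution.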

Figure III
Traditional Grade Distribution

Figure IV
Statistical Model

TABLE 1

Analysis of Variance: Mathematics Achievement

Source of                           Degrees of    Sums of       F
Variance                            Freedom       Squares       Ratio

Pretest Level Effect                               484.431      15.18**
Achievement by Grade Interaction                   403.685      12.65**
Level by Sex Interaction                           126.589       3.97**

Error Within                        76            2425.845

*p < .05, **p < .01
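
The F ratios in Table 1 can be checked from the printed sums of squares. The Python sketch below divides each effect's mean square by the error mean square (error sum of squares over its 76 degrees of freedom). The degrees of freedom for the three effects are not legible in the table, so the value of 1 used here is an assumption; it is the value under which the arithmetic reproduces the printed ratios.

```python
# Sums of squares as printed in Table 1.
effect_ss = {
    "Pretest level": 484.431,
    "Achievement by grade": 403.685,
    "Level by sex": 126.589,
}
SS_ERROR = 2425.845
DF_ERROR = 76

# Assumption: each effect has 1 degree of freedom (not legible in the table).
DF_EFFECT = 1

# Mean square within (error).
ms_error = SS_ERROR / DF_ERROR


def f_ratio(ss_effect, df_effect=DF_EFFECT):
    """F = (SS_effect / df_effect) / MS_error."""
    return (ss_effect / df_effect) / ms_error


f_ratios = {name: round(f_ratio(ss), 2) for name, ss in effect_ss.items()}
```

With these inputs the computed ratios round to 15.18, 12.65, and 3.97, matching the F Ratio column.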

TABLE 2



Post-Test Means of Gain Scores of Pre-Test Achievement by Level
Interaction

                              Evaluation Type

Pre-test Level      Norm-Referenced      Criterion-Referenced
                    Mean Gain            Mean Gain

High                21.99                19.34

Low                 12.25                18.95*
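
The pattern in Table 2 is easier to see as differences. The short Python sketch below subtracts the norm-referenced mean gain from the criterion-referenced mean gain at each pretest level, then takes the difference between those two differences. It is descriptive arithmetic on the printed means only and is not a substitute for the analysis of variance; the variable and function names are illustrative.

```python
# Mean gain scores as printed in Table 2, keyed by (pretest level, evaluation type).
mean_gain = {
    ("high", "norm"): 21.99,
    ("high", "criterion"): 19.34,
    ("low", "norm"): 12.25,
    ("low", "criterion"): 18.95,
}


def criterion_advantage(level):
    """Criterion-referenced mean gain minus norm-referenced mean gain for one level."""
    return round(mean_gain[(level, "criterion")] - mean_gain[(level, "norm")], 2)


low_advantage = criterion_advantage("low")    # 18.95 - 12.25
high_advantage = criterion_advantage("high")  # 19.34 - 21.99
# Difference of differences: how much more the low group gained under
# criterion-referenced evaluation than the high group did.
interaction_contrast = round(low_advantage - high_advantage, 2)
```

Low pretest scorers gained 6.70 points more under criterion-referenced evaluation, while high pretest scorers gained 2.65 points less, a contrast of 9.35 points between the two levels.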

TABLE 3

Difference in Pre-test and Post-test Mean Scores of Low Level Students

[Line graph. Vertical axis: Mathematics Achievement Test Scores, 0 to 30.
Pre-test mean: (5.5). Post-test means: X (17.7), O (13.1).]

X = Criterion-referenced
O = Norm-referenced